ABSTRACT

The tactic of "relaxation" has often been used in one guise or another
in order to cope with mathematical programs with a large number of
constraints, some or all of which may be only implicitly available. By
"relaxation" we mean the solution of a given problem via a sequence of
smaller problems that are relaxed in that some of the inequality
constraints are temporarily ignored. Relaxation has been used primarily
in the context of linear programming, but in this paper we examine a
version that is valid for a general class of concave programs.
Constraints are dropped as well as added from relaxed problem to relaxed
problem. A specialization to the completely linear case is shown to be
equivalent to Lemke's Dual Method. This result permits some pertinent
inferences to be drawn from the extensive computational experience
available for the (primal) Simplex Method. Other matters pertaining to
computational efficacy are discussed. An interpretation of relaxation in
terms of the dual in the nonlinear case is also established. The optimal
multipliers generated by successive relaxed problems turn out to comprise
a sequence of improving feasible solutions to the minimax dual. When
interpreted in this way, it becomes apparent that relaxation corresponds
to just the opposite tactic, which we call "restriction", applied to the
dual problem. Restriction is an equally interesting and useful tactic in
its own right, and its main features are outlined.

I.  INTRODUCTION

Quite often an optimization problem with some inequality constraints
possesses one or more of the following properties:

(1) prior knowledge is available concerning which of the constraints
might be active at an optimum solution;

(2) there are so many constraints that the dimension limits of coded
algorithms for available computers are exceeded;

(3) some of the constraints are available only implicitly, and can be
generated in explicit form only at substantial expense.

Property (1) may hold when a variant of the problem has been solved
before, or when the problem is amenable to physical or mathematical
insight. Property (2) is the seemingly ubiquitous bane of practical
applications. And property (3), usually in conjunction with property (2),
is a frequent consequence of mathematical manipulations of a more natural
problem formulation.

For problems such as these a rather obvious "relaxation" tactic comes to
mind for use in conjunction with any algorithm that would be applicable
were it not for properties (2) or (3): solve a relaxed version of the
given problem that ignores some of the inequality constraints; if the
resulting solution satisfies all of the ignored constraints then it must
be optimal in the original problem, but otherwise generate and include
one or more violated constraints in the relaxed problem and reoptimize
it; continue to generate and add violated constraints in this fashion
until the original problem has been solved^1.

^1 G.B. Dantzig [6] is largely responsible for popularizing this tactic
in the context of linear programming. He called it "the method of
additional restraints" for handling "secondary constraints". See also
Thompson, Tonge and Zionts [25].

This tactic seems quite promising if (as is usually the case) only a
fairly small proportion of all inequality constraints is actually binding
at an optimal solution of the original problem, provided that reasonably
efficient mechanisms are available for identifying violated constraints
and reoptimizing the relaxed problems. A useful improvement involves
dropping amply satisfied constraints of the relaxed problem from time to
time, but this must be done so as not to destroy the inherent finiteness
of the procedure.

It is interesting to observe that Lemke's Dual Method [17] can be
interpreted as a procedure for implementing the improved tactic within
the specialized context of linear programming. Curiously, this
conspicuous interpretation seems never to have been explicitly stated and
proved in the subsequent literature, although it is certainly part of the
"folklore" of linear programming and has been used in one form or another
by several authors^2. As a result of this gap in the literature, it would
seem that the pedagogy and even the development of mathematical
programming have suffered unnecessarily. Primal-motivated methods using
such tactics^3 and dual methods are rarely exhibited in their proper
relation to one another, and it has seldom been recognized that
computational experience with variants of truly primal methods tells us
something about the behavior of methods using corresponding variants of
relaxation. The purpose of this note is to help smooth over this hiatus
in the literature.

^2 See Balinski [1], Charnes, Cooper and Miller [4], Gomory [12], and
Gomory and Hu [13].

^3 See, e.g., Benders [3], Cheney and Goldstein [5], Dantzig, Fulkerson
and Johnson [7], Gomory [11], Kelley [16], Ritter [19], Stone [23], and
Van Slyke and Wets [26].

In sec. II we formally state a version of relaxation that permits
constraints to be dropped from, as well as added to, the relaxed
problems. Termination in a finite number of iterations is easily shown
for a general class of concave programs. In sec. III we establish, under
a non-degeneracy assumption, that in the completely linear case a
specialization of the relaxation tactic is equivalent to Lemke's Dual
Method. Matters pertaining to computational efficacy are discussed in the
following section. A number of inferences are drawn, with the help of the
result of sec. III, from available computational experience with variants
of the Simplex Method. In the fifth and final section we establish an
enlightening interpretation of relaxation in terms of the dual problem in
the nonlinear case. It turns out that the optimal multipliers generated
by successive relaxed problems comprise a sequence of improving feasible
solutions to the minimax dual. When interpreted in this way, it becomes
apparent that relaxation applied to the (original) primal problem
corresponds to just the opposite tactic, which we call "restriction",
applied to the dual. Restriction is an equally interesting and useful
tactic in its own right, and we conclude the paper with an outline of its
main features.




II.  STATEMENT AND PROOF OF THE RELAXATION TACTIC


Let $f, g_1, \dots, g_m$ be concave functions on a non-empty convex set
$X \subseteq R^n$, and define $M = \{1, 2, \dots, m\}$. The problem

$$\text{(P)} \qquad \max_{x \in X} f(x) \quad \text{subject to } g_i(x) \ge 0,\ i \in M$$

will be converted to a finite sequence of smaller problems of the form

$$(P_S) \qquad \max_{x \in X} f(x) \quad \text{subject to } g_i(x) \ge 0,\ i \in S \subseteq M.$$

Assume that a subset $S^0$ is known such that $(P_{S^0})$ admits an
optimal solution $x^{S^0}$ (with $f(x^{S^0}) < \infty$), and assume
further that $(P_S)$ admits an optimal solution whenever it admits a
feasible solution and its maximand is bounded above on the feasible
region. For these assumptions to hold it is of course sufficient, but not
necessary, that $X$ be compact and all functions continuous (one may
enforce boundedness, if necessary, by using a "regularization" artifice).

Under these assumptions, we shall show that the following tactic is
well-defined and terminates in a finite number of steps.

Relaxation

Step 0: Put $\bar f = \infty$ and $S = S^0$, where $S^0$ is any subset of
$M$ such that $(P_{S^0})$ admits a finite optimal solution.

Step 1: Solve $(P_S)$ for an optimal solution $x^S$ if one exists; if
none exists (i.e., $(P_S)$ is infeasible), then terminate with the
message "(P) infeasible". If $g_i(x^S) \ge 0$ for all $i \in M-S$,
terminate with the message "$x^S$ is an optimal solution of (P)";
otherwise, go to Step 2.

Step 2: Put $v$ equal to any subset of $M$ that includes at least one
constraint violated by $x^S$ ^4. If $f(x^S) < \bar f$, replace $S$ by
$E \cup v$, where $E = \{i \in S : g_i(x^S) = 0\}$, and $\bar f$ by
$f(x^S)$; otherwise (i.e., if $f(x^S) = \bar f$), replace $S$ by
$S \cup v$. Return to Step 1.
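
Steps 0 to 2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the text: the problem is a two-variable linear program, $X$ is a box (so every relaxed problem is bounded), each $g_i$ is encoded as a pair `(a, b)` meaning $g_i(x) = b - a \cdot x$, the relaxed problems are solved by brute-force vertex enumeration, and $v$ is always the single most violated constraint. The names `solve_relaxed`, `relaxation`, `box` and `cons` are hypothetical.

```python
from itertools import combinations

TOL = 1e-9


def solve_relaxed(c, rows):
    """Maximize c.x over {x in R^2 : a.x <= b for (a, b) in rows}.

    Brute-force vertex enumeration; assumes the region is bounded.
    Returns (x, value), or None if the region is empty.
    """
    best = None
    for (a1, b1), (a2, b2) in combinations(rows, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < TOL:
            continue  # parallel lines define no vertex
        x = ((b1 * a2[1] - a1[1] * b2) / det,
             (a1[0] * b2 - b1 * a2[0]) / det)
        if all(a[0] * x[0] + a[1] * x[1] <= b + TOL for a, b in rows):
            value = c[0] * x[0] + c[1] * x[1]
            if best is None or value > best[1] + TOL:
                best = (x, value)
    return best


def relaxation(c, box, cons, S0=()):
    """Steps 0-2 with v = {most violated constraint}.

    cons[i] = (a, b) encodes g_i(x) = b - a.x >= 0; box encodes X.
    Returns (x, history), where history lists the values f(x^S);
    x is None if (P) is infeasible.
    """
    def g(i, x):
        a, b = cons[i]
        return b - (a[0] * x[0] + a[1] * x[1])

    S, f_bar, history = set(S0), float("inf"), []            # Step 0
    while True:
        sol = solve_relaxed(c, box + [cons[i] for i in sorted(S)])  # Step 1
        if sol is None:
            return None, history
        x, value = sol
        history.append(value)
        violated = [i for i in range(len(cons))
                    if i not in S and g(i, x) < -TOL]
        if not violated:
            return x, history
        v = {min(violated, key=lambda i: g(i, x))}           # Step 2
        if value < f_bar - TOL:
            E = {i for i in S if abs(g(i, x)) <= TOL}        # keep binding ones
            S, f_bar = E | v, value
        else:
            S = S | v


# Hypothetical data: maximize x0 + x1 over the box [0, 10]^2.
box = [((1, 0), 10), ((0, 1), 10), ((-1, 0), 0), ((0, -1), 0)]
cons = [((1, 2), 8), ((3, 1), 9), ((1, -1), 5), ((1, 1), 25)]
x_opt, history = relaxation((1, 1), box, cons)
```

On this data only two of the four constraints are ever generated, and `history` records one value of $f(x^S)$ per relaxed problem, which makes the monotonicity argued below easy to inspect.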

This tactic simply goes from one relaxed problem to the next by adding at
least one constraint that is violated at an optimal solution of the
current relaxed problem, while deleting the amply satisfied constraints
so long as the value of the objective function is decreasing. Eventually
a relaxed problem is encountered that is either infeasible, in which case
(P) obviously must be infeasible, or has an optimal solution that is also
feasible in (P), in which case that solution obviously must solve (P).

To show that the relaxed problems which arise are either infeasible or
admit an optimal solution, in view of our assumptions it is enough to
show inductively that the sequence $\langle f^S \rangle$ is
non-increasing, where $f^S$ is the supremum of the maximand of $(P_S)$
(let $f^S = -\infty$ if $(P_S)$ is infeasible). Certainly
$f^{S \cup v} \le f^S$, and $f^{E \cup v} \le f^E$. We assert that
$f^E = f^S$, which yields the desired monotonicity of
$\langle f^S \rangle$.

This assertion is an easy consequence of

Lemma 1.1: Let $x^S$ be optimal for $(P_S)$. If $g_j(x^S) > 0$, where
$j \in S$, then $x^S$ is also optimal for $(P_{S-j})$.

Proof: Certainly $f^{S-j} \ge f(x^S)$. Suppose that $f^{S-j} > f(x^S)$.
Then

^4 Note the wide latitude in the choice of $v$. A common choice of $v$ is
to make it the index of the most violated constraint, but many other
choice criteria are possible. See the discussion of sec. IV.




there exists a point $x'$ feasible in $(P_{S-j})$ such that
$f(x') > f(x^S)$. We may assume $g_j(x') < 0$, or else $x'$ would
contradict the optimality of $x^S$ in $(P_S)$. By the concavity of $f$
and the $g_i$, $i \in S$, and the convexity of $X$, it follows that for
$\lambda$ positive but sufficiently small the point
$\lambda x' + (1-\lambda) x^S$ is feasible in $(P_S)$. But then
$f(\lambda x' + (1-\lambda) x^S) \ge \lambda f(x') + (1-\lambda) f(x^S) > f(x^S)$,
which contradicts the definition of $x^S$. Hence $f^{S-j} = f(x^S)$, and
$x^S$ must be optimal for $(P_{S-j})$.
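
The phrase "for $\lambda$ positive but sufficiently small" in the proof can be made explicit. The following is a sketch in the notation of Lemma 1.1; the symbol $\lambda^*$ is introduced here only for illustration.

```latex
% Constraints i in S other than j: x' and x^S both satisfy g_i >= 0,
% so concavity of g_i gives, for every 0 <= lambda <= 1,
g_i(\lambda x' + (1-\lambda) x^S) \;\ge\; \lambda g_i(x') + (1-\lambda) g_i(x^S) \;\ge\; 0 .

% Constraint j: here g_j(x') < 0 < g_j(x^S), so the same bound is
% non-negative exactly when lambda does not exceed lambda^*:
g_j(\lambda x' + (1-\lambda) x^S) \;\ge\; \lambda g_j(x') + (1-\lambda) g_j(x^S) \;\ge\; 0
\quad \text{for} \quad
0 \le \lambda \le \lambda^* = \frac{g_j(x^S)}{g_j(x^S) - g_j(x')} \in (0,1) .
```

Any $\lambda$ in $(0, \lambda^*]$ therefore yields a point feasible in $(P_S)$ with a strictly larger objective value, which is the contradiction used above.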

Thus far we have shown that the tactic is well-defined and that the
sequence $\langle f(x^S) \rangle$ is non-increasing. Since Step 2 only
deletes amply satisfied constraints from $S$ (before adding $v$) when
$f(x^S)$ has just decreased, it follows from the finiteness of the number
of possible subsets of $M$ that $\langle f(x^S) \rangle$ can remain
constant for only a finite number of consecutive iterations. Again
appealing to the finiteness of the number of possible trial sets, we see
that finite termination is established.

Theorem 1.1: Relaxation terminates in a finite number of steps with
either (a) an optimal solution of (P), or (b) the identification of a
subset of the constraints of (P) that are collectively infeasible over
$X$. Moreover, in case (a) a non-increasing sequence
$\langle f(x^S) \rangle$ of upper bounds on the optimal value of (P) is
obtained.

It is worth emphasizing again at this point that relaxation is a tactic
and no more. It is not a computational procedure for solving (P) until it
is applied in conjunction with an algorithm for solving the relaxed
problems $(P_S)$. However, let not its utter simplicity in the
mathematical sense belie its usefulness.

Other variants of this tactic for convex programming have been given by
Geoffrion [8 and 9], Oettli [18], Sethi [21], and Takeuti [24]. See also
Cheney and Goldstein [5] for an application and proof of similar tactics
to problems with an infinite number of constraints (cf. footnote 9).


III.  RELATION  TO  THE  DUAL  METHOD 

The fact that a feasible solution of (P) is not obtained until the final
step, and that $\langle f(x^S) \rangle$ is monotone decreasing to the
optimal value of (P), suggests the adjective "dual" in describing
relaxation. In this section we shall show that relaxation can be
specialized in a natural way so as to be equivalent to Lemke's Dual
Method [17] when (P) is a linear program^5. A more general dual
interpretation is suggested in the final section.

Let $f(x) = cx$, $g_i(x) = x_i$, $M = \{1, \dots, n\}$, and
$X = \{x : Ax = b\}$ hold for (P), where $A$ is $m_1 \times n$. The Dual
Method is initiated with some set $B^0$ of variables designated as
"basic" which yields, from the "reduced costs" of an associated tabular
representation of (P) (see below), a feasible solution of the dual to
(P). Assuming that the successive feasible solutions to the dual are
non-degenerate, we shall prove

Theorem 2.1: If $S^0$ is taken as $M - B^0$ and $v$ always as the most
violated constraint, then the set of non-basic variables at the
$\nu$-th iteration of the Dual Method coincides with $E$ at the
$\nu$-th iteration of relaxation, and the $\nu$-th basic solution
coincides with the $\nu$-th $x^S$.

It is necessary to give a brief rendering of the Dual Method in order to
establish the notation used in the proof. More complete details may be
found, for example, in [15] or [17].

Problem (P) can be restated as one of maximizing $z$ subject to
$x \ge 0$ and the following equality constraints stated as a tableau
($m_1 + 1$ by $n + 2$) of detached coefficients:

$$\begin{array}{c|c|c}
z & x & 1 \\ \hline
1 & -c & 0 \\
0 & A & b
\end{array}$$

^5 See also Beale's "method of leading variables" [2], developed
independently but nearly equivalent to Lemke's Dual Method.

At any given iteration there is specified a collection $B$ of basic
variables such that $A_B^{-1}$ exists, where $A_B$ is formed by
extracting columns from $A$ according to $B$, and such that
$\bar c = (c_B A_B^{-1}) A - c \ge 0$, where $c_B$ is similarly formed by
extraction according to $B$. Moreover, the equality constraints are
re-expressed as:

$$\begin{array}{c|c|c}
z & x & 1 \\ \hline
1 & (c_B A_B^{-1}) A - c & c_B A_B^{-1} b \\
0 & A_B^{-1} A & A_B^{-1} b
\end{array}$$



If $\bar b = A_B^{-1} b \ge 0$, then it is easily shown that an optimal
solution of (P) is at hand: put $x_j = 0$ for $j$ non-basic and the basic
variable $x_{B_i}$ (corresponding to the $i$-th row) equal to
$\bar b_i$. If $\bar b \not\ge 0$, then let $\bar b_r$ be the most
negative component (actually, any negative component will do) and test to
make sure that at least one component $\bar a_{rj}$ of the matrix
$A_B^{-1} A$ is negative for some non-basic $j$ (if none is negative, it
can be shown that (P) is infeasible). Let $k$ be defined so that

$$\frac{\bar c_k}{\bar a_{rk}} = \text{Maximum} \left\{ \frac{\bar c_j}{\bar a_{rj}} : j \text{ non-basic and } \bar a_{rj} < 0 \right\},$$

where $\bar c_j$ is the $j$-th reduced cost, and pivot on the element
$\bar a_{rk}$ to obtain the detached coefficient array corresponding to
the new set of basic variables $(B - B_r + k)$ ($x_k$ is called the
"entering," and $x_{B_r}$ the "exiting" basic variable). If
$\bar c_k > 0$ then $c_B A_B^{-1} b$ strictly decreases, and in any event
the new $\bar c$ is also non-negative. The assumption of dual
non-degeneracy means that $\bar c_j > 0$ for all non-basic $j$ at each
iteration, and can always be enforced by arbitrarily small perturbations
of the problem data.

We are now in a position to make three key observations about the Dual
Method. The first can be found essentially in Charnes, Cooper and
Miller [4].

Lemma 2.1: At any iteration of the Dual Method, the current basic
solution is the unique optimal solution of $(P_S)$ with $S$ equal to the
current set of non-basic variables ($S = M - B$).

Proof: The current basic solution is certainly feasible in $(P_{M-B})$.
To show that it is optimal, by the Dual Theorem of linear programming it
suffices to display a feasible solution to the dual of $(P_{M-B})$ with
the same value of the objective function. One has only to verify, using
$\bar c \ge 0$, that $(c_B A_B^{-1})$ is such a dual solution. Uniqueness
of the optimal solution of $(P_{M-B})$ follows from the assumed
non-degeneracy of the dual.

Lemma 2.2: If the Dual Method terminates because (P) is infeasible (i.e.,
if $\bar b_r < 0$ and $\bar a_{rj} \ge 0$ for all non-basic $j$ at some
tableau), then $(P_S)$, with $S$ equal to the current set of non-basic
variables plus $B_r$, is infeasible.

Proof: By the Dual Theorem of linear programming, it is enough to show
that the dual of $(P_S)$ is feasible and has an unbounded optimum. It may
be verified that $(c_B A_B^{-1}) + \theta (A_B^{-1})_r$, where
$(A_B^{-1})_r$ is the $r$-th row of $A_B^{-1}$ ($\bar b_r < 0$), is
feasible in the dual for all $\theta \ge 0$ and achieves an arbitrarily
small value of the dual objective as $\theta \to \infty$.

Lemma 2.3: At any non-terminal iteration of the Dual Method, if $x_k$ is
the entering basic variable then $x_k > 0$ in the next basic solution.

Proof: The definition of the pivot operation implies that
$x_k = (\bar b_r / \bar a_{rk})$ in the next basic solution. By
selection, $\bar b_r < 0$ and $\bar a_{rk} < 0$.
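
The pivot rule and the three observations can be traced numerically. The sketch below is an assumption-laden illustration rather than a production code: it works on a dense tableau with exact fractions, always takes the most negative $\bar b_r$, and expects a starting basis that is already dual feasible. The function name `dual_method` and the small example (minimize $2x_1 + 3x_2$ subject to $x_1 + x_2 \ge 4$, $x_1 + 2x_2 \ge 6$, written as a maximization with two surplus variables) are hypothetical.

```python
from fractions import Fraction as F


def dual_method(cbar, z, A, b, basis):
    """Dual Method on a tableau that is already dual feasible (cbar >= 0).

    cbar: reduced costs; z: current objective value c_B A_B^{-1} b;
    A, b: current A_B^{-1} A and A_B^{-1} b; basis[i]: basic variable of row i.
    Returns (x, z, trace), or None if (P) is found infeasible. trace holds
    (entering index k, its value after the pivot, new z) for each pivot.
    """
    cbar = [F(v) for v in cbar]
    A = [[F(v) for v in row] for row in A]
    b = [F(v) for v in b]
    z, basis, trace = F(z), list(basis), []
    n = len(cbar)
    while min(b) < 0:
        r = b.index(min(b))                       # most negative component
        cand = [j for j in range(n) if j not in basis and A[r][j] < 0]
        if not cand:
            return None                           # row r has no negative entry
        k = max(cand, key=lambda j: cbar[j] / A[r][j])   # ratio test
        piv = A[r][k]
        A[r] = [v / piv for v in A[r]]
        b[r] /= piv
        for i in range(len(b)):                   # eliminate column k elsewhere
            if i != r:
                fac = A[i][k]
                A[i] = [u - fac * w for u, w in zip(A[i], A[r])]
                b[i] -= fac * b[r]
        fac = cbar[k]
        cbar = [u - fac * w for u, w in zip(cbar, A[r])]
        z -= fac * b[r]
        basis[r] = k
        trace.append((k, b[r], z))
    x = [F(0)] * n
    for i, j in enumerate(basis):
        x[j] = b[i]
    return x, z, trace


# Variables (x1, x2, s1, s2); c = (-2, -3, 0, 0); initial basis {s1, s2}.
x, z, trace = dual_method(
    cbar=[2, 3, 0, 0], z=0,
    A=[[-1, -1, 1, 0], [-1, -2, 0, 1]], b=[-4, -6], basis=[2, 3])
```

In `trace`, each entering variable takes a strictly positive value (Lemma 2.3) and the objective value decreases at every pivot, as the non-degeneracy assumption requires.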

Proof of Th. 2.1: The proof proceeds by induction on $\nu$. At
$\nu = 1$, $S^0$ has been taken as $M - B^0$, the initial set of
non-basic variables. Lemma 2.1 asserts that $(P_{S^0})$ has a unique
solution $x^{S^0}$. Hence $x^{S^0}$ must be the initial basic solution.
Since $x_j^{S^0} = 0$ by definition for all non-basic $j$, $E = S^0$.
Thus the assertion is true for $\nu = 1$.

Assume that the assertion is true for the $\nu_0$-th iteration of the
Dual Method. Either the $\nu_0$-th iteration is terminal because (P) has
been solved, or is terminal because (P) has been found to be infeasible,
or is not terminal. In the first case, relaxation also terminates with an
optimal solution of (P). In the second case, by Lemma 2.2 the next
relaxed problem encountered is infeasible and therefore terminal.
Consider now the third case. We shall show that the assertion of the
theorem holds at the next iteration by detailing the operation of
relaxation starting at Step 2 of the current iteration.

Dual non-degeneracy implies that $f(x^S)$ decreases strictly at each
iteration. Hence the trial set to be used at the $(\nu_0 + 1)$st
iteration of relaxation is $E \cup B_r$, where $x_{B_r}$ is the most
negative component of the current $x^S$. It follows from Lemmas 2.1 and
2.3 that $(P_{E \cup B_r})$ has a unique solution, and that all
components indexed by $E \cup B_r$ vanish in this solution except for
$x_k$, which is strictly positive. The assertion of the theorem now
follows immediately.

IV.  COMPUTATIONAL EFFICACY

Let us turn now to questions concerning computational efficacy. We have
already mentioned in sec. I three properties which strongly encourage, if
not demand, the use of tactics such as the present one. But this is not
to say that computational success will necessarily be achieved if these
properties hold. Computational success probably depends more on the
following three conditions:

(a) only a fairly small proportion of all inequality constraints should
actually be binding at an optimal solution of (P);

(b) a reasonably efficient mechanism must be available for identifying
suitable violated constraints given a trial solution of (P);

(c) a reasonably efficient mechanism must be available for reoptimizing
the reduced problem at each iteration.

That condition (a) often holds has frequently been mentioned (and
exploited via similar tactics) for various special problem classes; for
example, by Dantzig, Fulkerson and Johnson [7] for a linear programming
equivalent of the traveling salesman problem, by Dantzig [6] for oil
refinery problems, by Charnes, Cooper and Miller [4] for bounded
variables and warehouse-type problems, and by Van Slyke and Wets [26] for
optimal control and stochastic programming problems.


In fact it is easy to see that condition (a) will always hold for linear
programs with a large number of inequality constraints relative to the
number of structural variables, since an optimal solution occurs at an
extreme point of the feasible region^6. The same line of reasoning does
not apply to nonlinear problems, but from curvature considerations and
common experience it appears that condition (a) holds even more strongly
than in the linear case.
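
A rough count behind the linear-case remark can be written out as follows. It is only a sketch under assumptions not stated in the text: the feasible region is described by $m$ inequalities in $n$ structural variables with no equality constraints, and the optimal extreme point is non-degenerate. The numbers in the last line are purely illustrative.

```latex
% At a non-degenerate extreme point of { x in R^n : a_i x <= b_i, i = 1..m },
% exactly n of the m inequalities hold with equality, so
\frac{\#\{\text{binding constraints}\}}{\#\{\text{all inequality constraints}\}}
  \;=\; \frac{n}{m} \;\ll\; 1 \qquad \text{when } m \gg n .
% Illustration: n = 50, m = 5000 gives a binding fraction of 50/5000 = 1\%.
```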

Condition (b) is least troublesome if the inequality constraints are
reasonable in number and explicitly available, for then there is no
difficulty in implementing any reasonable criterion for the choice of
$v$. Most commonly $v$ is taken to be the most violated constraint, but
many other criteria are possible. There is little theoretical or
empirical evidence to distinguish these criteria from one another in
terms of relative effectiveness. In view of the result of sec. III,
however, we can perhaps draw some tentative inferences based on
experience with the purely linear case. Extensive experiments (e.g.,
[27]) have been carried out comparing alternative rules for selecting
pivotal columns in the usual Simplex Method for linear programming. This
is actually the same as comparing analogous rules for choosing a
singleton $v$ with relaxation applied to the dual problem. Results
indicate that while the "most violated constraint" rule may not be best
in terms of minimizing the number of required iterations, other plausible
rules can be expected to be consistently better by no more
^6 It should be noted that one could argue (cf. Smith and Orchard-Hays
[22] and Stone [23]) for the usefulness in linear programming of tactics
of the present sort even in the absence of condition (a).

than a factor of two or so. An example of a somewhat better rule is the
so-called "greatest-change" rule, which for the present tactic amounts to
choosing $i$ in $M - S$ to maximize the decrease in the optimal value of
the next relaxed problem. Unfortunately such a rule is likely to be
expensive to implement for a nonlinear problem. Choosing $v$ to be the
most violated constraint typically leads, in the linear case, to a number
of iterations equal to about twice the number of variables. Results
pertinent to the choice of $v$ when more than one constraint index is
allowed are available from experiments with the "suboptimization" tactic
[27, p. 190]. It was observed^7, for example, that taking $v$ to consist
of the five most violated constraints reduced the number of iterations by
a factor of two as compared with the single most violated constraint
rule. Of course this increases the amount of computation required to
solve each relaxed problem, but with the product form of the Simplex
Method there is a significant net benefit in terms of total computing
time. It is not known to what extent experience such as this for the
linear case is a useful guide for the choice of $v$ in the nonlinear
case.
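
When the constraints are explicitly available, the "most violated" rule and its multi-constraint variant amount to a sort. The following is a minimal sketch; the function name `choose_v`, the parameter `k`, and the sample values are hypothetical and not taken from [27].

```python
def choose_v(g_values, S, k=1, tol=1e-9):
    """Return the indices of the (up to) k most violated constraints outside S.

    g_values[i] is g_i evaluated at the current x^S; constraint i counts as
    violated when g_values[i] < -tol. k = 1 gives the single "most violated
    constraint" rule; k = 5 would mimic the five-constraint variant.
    """
    violated = [i for i, g in enumerate(g_values) if i not in S and g < -tol]
    violated.sort(key=lambda i: g_values[i])      # most negative first
    return set(violated[:k])


# Hypothetical constraint values at some trial point, with S = {3}.
g_at_x = [0.5, -3.0, -0.1, 0.0, -7.0, -2.0]
single = choose_v(g_at_x, S={3})
pair = choose_v(g_at_x, S={3}, k=2)
```

An empty result signals that the trial point is feasible in (P), which is the termination test of Step 1.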

Condition (b) is more troublesome when the constraints are vast in number
or only implicitly available. In this case concern over the best
criterion for the choice of $v$ is often all but irrelevant, since none
but the simplest criteria can be implemented at reasonable computational
cost. Sometimes only a few violated constraints are inexpensively
available each time the relaxed problem is solved, and it

^7 Again we invoke a dual interpretation of the primal algorithm.

is indicated that they be used whether or not they satisfy any global
criterion. This is the case, for example, with Dantzig, Fulkerson and
Johnson's problem [7], with Gomory's integer programming algorithms
[12, p. 133], and with Kelley's cutting-plane method [16]^8. On other
occasions one can implement the "most violated constraint" criterion by
solving a subsidiary optimization problem (or several smaller subsidiary
problems, should special structure cause the constraints to partition
naturally into several groups). This was the prescription of Cheney and
Goldstein [5] in most of their algorithms^9. For Benders [3] the
subsidiary problem took the form of a linear program, and for Gomory and
Hu [13] it took the form of network flow problems. For other special
structures the subsidiary problem of finding the most violated constraint
might assume the form of a dynamic program, or an integer program, etc.;


^8 To formally view Kelley's method as an instance of relaxation one may
represent the relevant portion of the set
$\{x : g_i(x) \ge 0,\ i \in M\}$ in (P) by the intersection of an
infinite number of containing half-spaces. (P) can thus be written as the
problem of maximizing $f(x)$ subject to $x$ in $X$ and
$G(x') + \nabla_{x'} (x - x') \ge 0$ for all $x'$ in $X$ such that
$G(x') < 0$, where $G(x) = \text{Min}\{g_1(x), \dots, g_m(x)\}$ and
$\nabla_{x'}$ is a subgradient of $G$ at $x'$ (if
$g_{i_0}(x') = G(x')$ and $g_{i_0}$ is differentiable, then one can take
$\nabla_{x'}$ as the gradient of $g_{i_0}$ at $x'$). Kelley's choice of
$v$ corresponds to the constraint
$G(x^S) + \nabla_{x^S} (x - x^S) \ge 0$.
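
The construction of the constraint in footnote 8 can be sketched as follows, assuming differentiable $g_i$ whose gradients are supplied by the caller. The function name `kelley_cut` and the two sample constraints are hypothetical; the cut is returned as a pair `(alpha, beta)` standing for the half-space $\alpha \cdot y + \beta \ge 0$.

```python
def kelley_cut(gs, grads, x):
    """Linearize G = min_i g_i at the trial point x.

    gs[i](x) evaluates g_i and grads[i](x) its gradient (a tuple).
    Returns (alpha, beta) such that the cut reads alpha . y + beta >= 0,
    i.e. G(x) + grad . (y - x) >= 0 with grad taken from a minimizing g_i.
    """
    values = [g(x) for g in gs]
    i0 = min(range(len(gs)), key=lambda i: values[i])   # index attaining G(x)
    alpha = grads[i0](x)
    beta = values[i0] - sum(a * xi for a, xi in zip(alpha, x))
    return alpha, beta


# Hypothetical concave constraints: a unit disc and a half-plane.
gs = [lambda y: 1.0 - y[0] ** 2 - y[1] ** 2,
      lambda y: 2.0 - y[0]]
grads = [lambda y: (-2.0 * y[0], -2.0 * y[1]),
         lambda y: (-1.0, 0.0)]
alpha, beta = kelley_cut(gs, grads, (2.0, 0.0))
```

For the trial point $(2, 0)$ the cut is $-4 y_1 + 5 \ge 0$: it excludes the trial point itself while, by concavity, retaining every point of the disc.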

^9 See especially algorithms I, II, and IV. The minimand of each problem
is the supremum of a collection of linear functions. To view this work in
the present context, each minimand should be expressed as a collection of
constraints using an additional variable and the least upper bound
definition of a supremum. An interesting historical sidelight mentioned
by Cheney and Goldstein is that the roots of their algorithms, and hence
of relaxation, date back to E. Remez's work on polynomial approximation
published in 1934.

examples of such algorithms are readily available (see Balinski [1] and
Gomory [12]) if we interpret primal algorithms with column-generating
techniques as dual algorithms for the dual problem with row-generating
techniques. Gilmore and Gomory [10, p. 877] have reasoned along these
lines to establish a connection between their computational experience
with the cutting-stock problem and previous experience with Gomory's
integer programming algorithms. They conclude that in large linear
programming problems computation times are likely to be long or erratic
when $v$ is a singleton chosen more or less blindly from the violated
constraints, as opposed to the choice of $v$ as the most violated
constraint.

Condition (c) is met by the usual post-optimality techniques for adding
additional constraints if $(P_S)$ is a linear program. These techniques
typically involve an iteration or two of the Dual Method, although they
can be viewed in purely primal terms (consider the additional constraints
as functions to be maximized until their values reach 0)^10. The latter
view is an appropriate one to use in the general nonlinear case, since it
leads to a fairly easy modification of most primal nonlinear programming
algorithms applicable to $(P_S)$ and takes advantage of the availability
of a feasible and optimal solution to the previous relaxed problem.


^10 Alternatively, one can parametrically deform (in any of several ways)
each relaxed problem into the next one.

V.  DUAL INTERPRETATION: RESTRICTION

The utter simplicity of relaxation certainly makes its interpretation in
terms of (P) completely transparent. In the purely linear case there is
no difficulty interpreting relaxation in terms of the dual of (P) as
well, since the Dual Method amounts to the ordinary Simplex Method
applied to the dual problem. In this section we establish an
interpretation of relaxation in terms of the dual of (P) for the
nonlinear case. It is of considerable interest that relaxation applied to
(P) corresponds to a "restriction" tactic applied to the dual.
Restriction is a useful tactic in its own right, with a rationale and
justification paralleling that of relaxation in many respects.

The natural dual problem^11 associated with (P) is

$$\text{(D)} \qquad \min_{\lambda_i \ge 0,\ i \in M} \Big\{ \sup_{x \in X} \Big[ f(x) + \sum_{i \in M} \lambda_i g_i(x) \Big] \Big\}.$$

We assert that the sequence of optimal multipliers associated with the
$g_i$ constraints of the successive relaxed problems (which can be
guaranteed to exist under various mild qualifications) constitutes a
sequence of improving feasible solutions to (D). Let us denote the
optimal multipliers for $(P_S)$ by $\lambda_i^S$, $i \in S$. Since
$\lambda_i^S$ is necessarily non-negative, the feasibility of these
multipliers in (D) is immediate (take $\lambda_i^S = 0$ for
$i \in M - S$). By the saddlepoint condition characterizing
$(x^S, \lambda^S)$, moreover, we have

$$f(x^S) = \sup_{x \in X} \Big[ f(x) + \sum_{i \in M} \lambda_i^S g_i(x) \Big].$$

$^{11}$ A thorough discussion of modern nonlinear duality theory is
given by Rockafellar [20]. We assume here at least a passing
familiarity with the main concepts and results.
That $\langle \lambda^S \rangle$ is an improving sequence of feasible
solutions of (D) is thus confirmed by the result of sec. II that
$\langle f(x^S) \rangle$ is a nonincreasing sequence.
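
The argument can be spelled out in a few lines. What follows is a sketch
only; the symbol $\varphi$ for the minimand of (D) is notation assumed
here for convenience and does not appear in the original text.

```latex
% \varphi is assumed notation for the minimand of (D).
\begin{align*}
\varphi(\lambda) &= \sup_{x \in X} \Big\{ f(x) + \sum_{i \in M} \lambda_i\, g_i(x) \Big\},
  && \lambda \ge 0, \\
\varphi(\lambda^S) &= \sup_{x \in X} \Big\{ f(x) + \sum_{i \in S} \lambda_i^S\, g_i(x) \Big\} = f(x^S)
  && \text{(since } \lambda_i^S = 0 \text{ for } i \in M-S \text{)}, \\
\varphi(\lambda) &\ge f(x) + \sum_{i \in M} \lambda_i\, g_i(x) \ \ge\ f(x)
  && \text{for every } x \text{ feasible in (P)}.
\end{align*}
```

Under these assumptions each $f(x^S)$ is the value of (D) at the
feasible point $\lambda^S$, and by the last line it is also an upper
bound on the optimal value of (P). A nonincreasing sequence
$\langle f(x^S) \rangle$ therefore means that the dual values never get
worse from one relaxed problem to the next.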

This result leads one to ask whether there is a natural rationale for
relaxation when viewed as a method for solving the dual problem. The
answer is affirmative: relaxation applied to (P) amounts to a
"restriction" tactic applied to (D). To explain this assertion, it will
be more enlightening to explain "restriction" as applied to (P) rather
than to (D). The reader can then understand our assertion in light of
the fact that $g_i(x^S)$ plays the role of the dual variable associated
with $\lambda_i$ in (D); it can be shown that
$(g_1(x^S), \ldots, g_m(x^S))$ is a subgradient of the minimand of (D)
evaluated at $\lambda^S$ (defined as above for $S \subset M$).
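
The subgradient property can be checked directly. The following sketch
writes $\varphi$ (notation assumed here, not the paper's) for the
minimand of (D) and uses two consequences of the saddlepoint condition:
$\varphi(\lambda^S) = f(x^S)$, and complementary slackness,
$\lambda_i^S\, g_i(x^S) = 0$ for all $i \in M$.

```latex
% \varphi is assumed notation for the minimand of (D).
\begin{align*}
\varphi(\lambda) &\ge f(x^S) + \sum_{i \in M} \lambda_i\, g_i(x^S)
  && \text{(take } x = x^S \text{ in the supremum)} \\
 &= \varphi(\lambda^S) + \sum_{i \in M} (\lambda_i - \lambda_i^S)\, g_i(x^S)
  && \text{for all } \lambda \ge 0 .
\end{align*}
```

This is the subgradient inequality for $\varphi$ at $\lambda^S$. It
suggests, without proving, why a constraint violated by $x^S$ (one with
$g_i(x^S) < 0$) is a natural candidate for a change in $\lambda_i$: that
is a coordinate along which the dual minimand may decrease.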

Let us now briefly consider restriction, the opposite of relaxation, in
the context of (P). Again (P) is converted into a sequence of simpler
problems, but now the simpler problems are restricted instead of
relaxed. Each is of the form

$$(Q_S)\qquad \mathop{\mathrm{Maximize}}_{x \in X}\ f(x) \quad \mbox{subject to} \quad \begin{array}{ll} g_i(x) = 0, & i \in S \subset M \\ g_i(x) \ge 0, & i \in M-S, \end{array}$$


where $S$ is a subset of the constraint indices. In order that $(Q_S)$
should be a concave program we require $g_i$ to be linear for
$i \in S$; so we may as well assume that $g_i$ is linear for all
$i \in M$ by incorporating any nonlinear constraints into $X$. Usually
the constraints $g_i(x) \ge 0$, $i \in M$, will include the customary
non-negativity constraints on the variables ($g_i(x) = x_i$). In this
case the variables indexed by $S$ vanish in $(Q_S)$; thus if $S$ is
populous relative to $M$, $(Q_S)$ is much more tractable than (P).
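
To illustrate how the variables indexed by S drop out, here is a minimal
sketch for the linear case. It assumes the problem data are held as a
cost list `c` and a row-major coefficient matrix `A`, and that every
index in `S` refers to a non-negativity constraint $g_i(x) = x_i$. The
names `c`, `A` and `restrict_columns` are hypothetical and do not come
from the paper.

```python
def restrict_columns(c, A, S):
    """Return the data of (Q_S) after deleting the variables fixed at zero.

    Assumes each index in S is a non-negativity constraint x_j >= 0 that
    (Q_S) tightens to x_j = 0, so column j can simply be removed.
    """
    keep = [j for j in range(len(c)) if j not in S]
    c_red = [c[j] for j in keep]
    A_red = [[row[j] for j in keep] for row in A]
    return keep, c_red, A_red


# Illustrative data: four variables, two explicit rows.
c = [3.0, 5.0, 4.0, 1.0]
A = [[2.0, 4.0, 1.0, 3.0],
     [1.0, 0.0, 2.0, 1.0]]
keep, c_red, A_red = restrict_columns(c, A, S={0, 3})
```

Here only the columns listed in `keep` survive, so the restricted
problem has two variables instead of four. The saving grows with the
size of S relative to M.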

Assume that $(Q_S)$ admits an optimal solution whenever it is feasible
and its maximand is bounded above on the feasible region, and that in
this event optimal multipliers associated with the constraints are
available. Then the following tactic is well-defined and terminates in
a finite number of steps.

Restriction

Step 0  Put $\bar f = -\infty$ and $S = S^0$, where $S^0$ is any
        subset of $M$ such that $(Q_{S^0})$ is feasible (such a subset
        fails to exist if and only if (P) is infeasible).

Step 1  Solve $(Q_S)$ for an optimal solution $x^S$ (if the maximand
        of $(Q_S)$ is unbounded above, then the same is obviously true
        of (P)). If the optimal multipliers $\mu_i^S$ associated with
        constraints $g_i(x) = 0$ ($i \in S$) are all non-negative,
        then terminate with the message "$x^S$ is an optimal solution
        of (P)"; otherwise, go to Step 2.

Step 2  Put $V$ equal to any subset of $S$ that includes at least one
        constraint for which $\mu_i^S < 0$. If $f(x^S) > \bar f$,
        replace $S$ by $S \cup E - V$, where
        $E = \{ i \in M-S : g_i(x^S) = 0 \}$; otherwise (i.e., if
        $f(x^S) = \bar f$), replace $S$ by $S - V$. Return to Step 1.
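
The steps above can be sketched on a small linear example. The sketch is
illustrative only and rests on several assumptions that are not in the
text: the problem is a continuous knapsack, maximize $c \cdot x$ over
$X = \{x : a \cdot x \le b\}$ with $a_j > 0$ and $b > 0$, with
constraints $g_j(x) = x_j \ge 0$; $(Q_S)$ is solved in closed form; the
multipliers are computed as $\mu_j = \pi a_j - c_j$, where $\pi$ is the
price on the knapsack row; $V$ is taken to be the single most negative
multiplier; and $\bar f$ is updated to $f(x^S)$ whenever the value
improves, an update the steps as printed do not state explicitly. All
function and variable names are hypothetical.

```python
def solve_restricted(c, a, b, S):
    """Solve (Q_S): max c.x  s.t.  a.x <= b, x_j = 0 (j in S), x_j >= 0 (else).

    Assumes a_j > 0 and b > 0, so the whole budget goes to the free
    variable with the best ratio c_j / a_j (if that ratio is positive).
    """
    n = len(c)
    free = [j for j in range(n) if j not in S]
    x = [0.0] * n
    price = 0.0                                  # price on the row a.x <= b
    if free:
        k = max(free, key=lambda j: c[j] / a[j])
        if c[k] > 0:
            x[k] = b / a[k]
            price = c[k] / a[k]
    value = sum(c[j] * x[j] for j in range(n))
    mu = {j: price * a[j] - c[j] for j in S}     # multipliers of x_j = 0, j in S
    return x, value, mu


def restriction(c, a, b, S0, tol=1e-12):
    S = set(S0)                                  # Step 0
    f_bar = float("-inf")
    history = []                                 # values f(x^S), for inspection
    while True:
        x, value, mu = solve_restricted(c, a, b, S)        # Step 1
        history.append(value)
        negative = [j for j in S if mu[j] < -tol]
        if not negative:
            return x, value, history             # x^S is optimal in (P)
        V = {min(negative, key=lambda j: mu[j])}            # Step 2
        if value > f_bar:
            # E: constraints outside S that are tight at x^S (x_j is exactly 0.0)
            E = {j for j in range(len(c)) if j not in S and x[j] == 0.0}
            S = (S | E) - V
            f_bar = value        # assumed update; not stated in the printed steps
        else:
            S = S - V


c, a, b = [3.0, 5.0, 4.0], [2.0, 4.0, 1.0], 8.0
x, value, history = restriction(c, a, b, S0={0, 1, 2})
```

On these data the sketch solves three restricted problems with values
0, 10 and 32, and stops at x = (0, 0, 8). Starting from S equal to all
of M corresponds to starting with every variable at zero.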
Note that constraints can enter the restricted set $S$, as well as
leave it, so long as $f(x^S)$ is increasing. Clearly each trial
solution $x^S$ is feasible in (P), and $\langle f(x^S) \rangle$ is a
non-decreasing sequence.

It  is  easy  to  see  that  an  appropriate  specialization  of  this 
tactic  to  the  linear  case  is  equivalent  to  the  ordinary  Simplex  Method. 
The  set  S  then  corresponds  at  each  iteration  to  the  current  non-basic 
variables. 

The circumstances in which restriction is an appealing tactic are
precisely those mentioned in sec. I, if we read "variables" for
"constraints" in the three properties of (P) mentioned there. For
example, restriction is appealing when the number of variables is very
great, or when the problem data corresponding to many of the variables
are available only implicitly unless substantial expense is incurred.
The circumstances in which restriction is likely to be computationally
effective are analogous to those discussed in the previous section for
relaxation. The analogue of condition (b), for instance, is that there
must be a reasonably efficient mechanism at Step 2 for identifying
variables in $S$ whose corresponding multipliers are negative. Indeed,
this is exactly what "column-generation" schemes for large-scale linear
programming are all about. See the surveys by Balinski [1] and Gomory
[12] for lucid discussions of such schemes.
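
A pricing step of this kind might look as follows. This is a sketch
under assumptions: columns are produced lazily by a generator, so that
data for variables in S are materialized only on demand; `duals` holds
prices on the explicit rows; and the multiplier of column j is taken to
be $\sum_r \pi_r a_{rj} - c_j$, the reduced cost with the sign
convention of a maximization problem. The names `price_out` and
`lazy_columns` are hypothetical.

```python
def price_out(duals, columns, tol=1e-9):
    """Return (j, multiplier) for the first column with a negative multiplier.

    `columns` is any iterable of (index, cost, coefficients); scanning
    stops at the first hit, so later columns are never generated.
    Returns None when no column prices out negative.
    """
    for j, cost, coeffs in columns:
        mu = sum(p * a for p, a in zip(duals, coeffs)) - cost
        if mu < -tol:
            return j, mu
    return None


def lazy_columns():
    """Hypothetical implicit column source (illustrative data only)."""
    costs = [3.0, 5.0, 4.0]
    coeffs = [[2.0], [4.0], [1.0]]
    for j, (cost, col) in enumerate(zip(costs, coeffs)):
        yield j, cost, col


first = price_out([1.25], lazy_columns())      # some column still prices out
none_left = price_out([4.0], lazy_columns())   # no negative multiplier remains
```

Stopping at the first negative multiplier is one design choice among
several; a scheme could equally scan for the most negative one at
greater cost per iteration.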

That restriction is most highly developed in the context of linear
programming should not obscure its applicability in the nonlinear case.
An outstanding example of the use of restriction for large structured
nonlinear programs is Rosen's convex partition programming algorithm
(in [14]), where it is used subsequent to a "partitioning" of the
variables.

Restriction and relaxation, although opposites of one another, are by
no means incompatible. The author has indicated elsewhere ([8] and [9])
how the two tactics can be employed simultaneously. The reduced
problems then become simpler still than $(P_S)$ or $(Q_S)$, but
assurance of finite termination requires somewhat more intricate
control. The computational advantages of such a combined approach can
be dramatic.$^{12}$


$^{12}$ A. M. Geoffrion, "Constrained Maximum Likelihood Estimation of
Several Stochastically Ordered Distributions," The RAND Corporation,
forthcoming.